AIbase

# Reinforcement learning inference

Deepseek R1 Distill Qwen 32B Unsloth Bnb 4bit
Apache-2.0
DeepSeek-R1 is the first-generation reasoning model released by the DeepSeek team. Trained with large-scale reinforcement learning without supervised fine-tuning (SFT) as an initial step, it demonstrates strong reasoning capabilities.
Large Language Model, Transformers, English
unsloth
938
10
© 2025 AIbase